On Negative Examples for Distantly-supervised Relation Extraction
Abstract
A careful reader of our recent EMNLP paper (Surdeanu et al., 2012) will notice a somewhat unconventional notation: we use Pi to denote the set of known positive labels for an entity tuple i, and Ni to denote the set of known negative labels for the same tuple, that is, the labels for which tuple i serves as a negative example (introduced in Section 4). This is uncommon: traditionally, at training time an entity tuple (e1, e2) that does not exist in the training DB is considered a negative example for all possible labels, which makes maintaining an explicit set Ni unnecessary. Most previous work on this topic, including our EMNLP 2012 paper, used this heuristic.

However, in the context of the KBP slot filling task, where most infoboxes provided as training data are incomplete, this heuristic is not ideal. For example, assume we have an incomplete infobox for Rachmaninoff with a single slot, (person:country of birth, Russia), and during training we see the tuple (Rachmaninoff, United States). What label should we assign to this tuple? According to the above heuristic, this tuple is a negative example for all the valid labels for PERSON entities, including person:country of death. But this is wrong: Rachmaninoff did in fact die in the United States; we simply did not have this information in the corresponding infobox.

A better heuristic, which, to my knowledge, was discovered concurrently by (Sun et al., 2011) and (Surdeanu et al., 2011), is to consider the tuple a negative example only for those relations that appear in e1's infobox with a different argument. More formally, using our notation, Ni for the ith tuple (e1, e2) is defined as:

Ni = {rj | rj(e1, ek) ∈ D, ek ≠ e2, rj ∉ Pi}

That is, in the above example, (Rachmaninoff, United States) should be considered a negative example only for person:country of birth, because this is the only slot we know with certainty about Rachmaninoff in our training dataset.

Modeling this heuristic with local, one-vs-rest classifiers is trivial, as illustrated by both (Sun et al., 2011) and (Surdeanu et al., 2011): one simply creates a different set of negative examples for each label, according to the data available in the infoboxes. Implementing it in a joint model, however, is not as straightforward. To pat ourselves on the back: we designed our EMNLP algorithm from the very beginning under the assumption that Ni must be maintained explicitly, although we left the empirical analysis of the heuristic discussed here as future work. As a result, our algorithm works as is with this new heuristic (one simply constructs Ni differently).
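To make the heuristic concrete, the following minimal Python sketch builds Pi and Ni for a candidate tuple from a toy knowledge base. The kb mapping and the build_label_sets helper are hypothetical names used purely for illustration, not from the paper; slot names are written with underscores here only because they serve as string keys.

```python
def build_label_sets(kb, e1, e2):
    """For a candidate tuple (e1, e2), return (Pi, Ni) under the heuristic:
    Pi = {rj | rj(e1, e2) is in the KB}             (known positive labels)
    Ni = {rj | rj(e1, ek) is in the KB, ek != e2,
               and rj is not in Pi}                 (known negative labels)
    kb maps (entity, relation) -> set of known argument values.
    """
    positives = {r for (e, r), values in kb.items()
                 if e == e1 and e2 in values}
    negatives = {r for (e, r), values in kb.items()
                 if e == e1 and r not in positives
                 and any(v != e2 for v in values)}
    return positives, negatives

# Incomplete infobox for Rachmaninoff: only the birth country is known.
kb = {("Rachmaninoff", "person:country_of_birth"): {"Russia"}}

Pi, Ni = build_label_sets(kb, "Rachmaninoff", "United States")
print(Pi)  # set(): the tuple matches no known slot
print(Ni)  # {'person:country_of_birth'}: the only safe negative label.
           # Note the tuple is NOT a negative example for
           # person:country_of_death, which is merely missing from
           # the incomplete infobox.
```

In a one-vs-rest setup, the tuple would then be added to the negative training set of exactly the classifiers named in Ni, rather than to every classifier's negative set.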